Data Science in Marketing

Hui Lin

2020-03-26

About Me

Before 2018-05

The pyramid of data needs illustrated by Monica Rogati

Before 2018-05

After 2018-05

This is how I feel

After 2018-05

Case Study: Group Lasso Logistic Regression for Customer Retention

Project Cycle

Business Questions

  1. How likely is a customer to purchase?
  2. What are the key drivers?

Clarification Questions

Refined Question

Project Cycle

Data Preprocessing

Project Cycle

Multivariate Logistic Regression

\[\ln\mathcal{L}(\boldsymbol{\beta}|\mathbf{y})=\sum_{i=1}^{n}\left\{ y_{i}\ln\frac{1}{1+\exp(-\mathbf{x_{i}}^{T}\boldsymbol{\beta})}+(1-y_{i})\ln\left[1-\frac{1}{1+\exp(-\mathbf{x_{i}}^{T}\boldsymbol{\beta})}\right]\right\} \]

\[D(\boldsymbol{\beta})\equiv\frac{\partial\ln\mathcal{L}(\boldsymbol{\beta}|\mathbf{y})}{\partial\boldsymbol{\beta}}=\sum_{i=1}^{n}\left\{ y_{i}-\frac{1}{1+\exp(-\mathbf{x_{i}}^{T}\boldsymbol{\beta})}\right\} \mathbf{x_{i}}\]
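
A minimal NumPy sketch of the logistic log-likelihood and its gradient (the score), with a finite-difference check that the two agree. The function names and the toy data are illustrative assumptions, not part of the original project.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(beta, X, y):
    """Log-likelihood of logistic regression for a 0/1 response y."""
    eta = X @ beta
    # y*ln(p) + (1-y)*ln(1-p) simplifies to y*eta - ln(1 + exp(eta)).
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def score(beta, X, y):
    """Gradient of the log-likelihood: sum_i (y_i - p_i) * x_i."""
    return X.T @ (y - sigmoid(X @ beta))

# Toy data, made up purely for illustration.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
y = (rng.random(50) < 0.5).astype(float)
beta = np.array([0.1, -0.2, 0.3])

# Central finite differences should reproduce the analytic gradient.
eps = 1e-6
numeric = np.array([
    (log_likelihood(beta + eps * e, X, y)
     - log_likelihood(beta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(3)
])
gradient_ok = np.allclose(numeric, score(beta, X, y), atol=1e-5)
```

Setting the score to zero gives the maximum likelihood estimate; there is no closed form, so it is solved iteratively in practice.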

Lasso: Weighted L1-norm Penalty [Tibshirani 1996]

Group Lasso Logistic Regression

\[\mathcal{S}_{\lambda}(\beta)=-l(\beta)+\lambda\sum_{g=1}^{G}s(df_{g})\parallel\beta_{g}\parallel_{2}\]

where \(l(\beta)\) is the log-likelihood:

\[\sum_{i=1}^{n}\{y_{i}\eta_{\beta}(\mathbf{x_{i}})-\log[1+\exp(\eta_{\beta}(\mathbf{x_{i}}))]\}\]

\(\lambda\) is the tuning parameter for the penalty, and \(s(\cdot)\) rescales the penalty by group size: \(s(df_{g})=df_{g}^{0.5}\), where \(df_{g}\) is the number of parameters in group \(g\)
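
A minimal sketch of how the penalized objective could be evaluated for a given coefficient vector. It assumes a linear predictor \(\eta_{\beta}(\mathbf{x_{i}})=\mathbf{x_{i}}^{T}\beta\) and that `groups` (a hypothetical argument) lists the column indices of each penalized group; columns that appear in no group, such as an intercept, are left unpenalized.

```python
import numpy as np

def group_lasso_objective(beta, X, y, groups, lam):
    """Negative log-likelihood plus the group lasso penalty.

    groups: list of index lists, one per penalized group (an assumption
    of this sketch, not the original project's data layout).
    """
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    penalty = 0.0
    for idx in groups:
        idx = np.asarray(idx)
        # s(df_g) = sqrt(df_g), with df_g the number of columns in the group.
        penalty += np.sqrt(len(idx)) * np.linalg.norm(beta[idx])
    return -loglik + lam * penalty

# Toy example: column 0 is unpenalized, columns 1 and 2 form one group.
X_toy = np.eye(3)
y_toy = np.array([1.0, 0.0, 1.0])
beta_toy = np.array([0.5, 3.0, 4.0])
unpenalized = group_lasso_objective(beta_toy, X_toy, y_toy, [[1, 2]], lam=0.0)
penalized = group_lasso_objective(beta_toy, X_toy, y_toy, [[1, 2]], lam=1.0)
```

Because the penalty uses the unsquared L2 norm of each group, a whole group of coefficients (for example, all dummy columns of one categorical variable) enters or leaves the model together.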

Performance Measure

\[\lambda_{max}=\max_{g\in\{1,\dots,G\}}\frac{1}{s(df_{g})}\parallel\mathbf{x}_{g}^{T}(\mathbf{y}-\bar{y})\parallel_{2}\]
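
A short sketch of computing \(\lambda_{max}\), assuming the same hypothetical `groups` layout (one list of column indices per group) and \(s(df_{g})=df_{g}^{0.5}\). This value is typically used as the upper end of the \(\lambda\) grid: with an unpenalized intercept, penalties at or above it are expected to shrink every penalized group to zero.

```python
import numpy as np

def lambda_max(X, y, groups):
    """Largest useful penalty: max over groups of ||X_g^T (y - ybar)|| / sqrt(df_g)."""
    resid = y - y.mean()
    return max(
        np.linalg.norm(X[:, idx].T @ resid) / np.sqrt(len(idx))
        for idx in groups
    )

# Toy data, made up for illustration.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
y_demo = np.array([1.0, 0.0, 1.0, 0.0])
lam_separate = lambda_max(X_demo, y_demo, [[0], [1]])  # each column its own group
lam_joint = lambda_max(X_demo, y_demo, [[0, 1]])       # both columns in one group
```

A decreasing grid such as `lam_separate * 0.9 ** np.arange(20)` is one way to trace the solution path; the grid spacing is a modeling choice, not something the formula dictates.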

Model Training and Testing

Cut-off Tuning

  1. Order the scores from high to low
  2. Calculate the sensitivity and specificity as the cut-off changes
  3. Get the cut-off values with corresponding likelihoods
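
The steps above can be sketched as follows. The selection rule at the end (maximizing sensitivity + specificity) is only one common choice and is an assumption here; the slides do not say which criterion was used. The scores and labels are made up.

```python
import numpy as np

def cutoff_table(scores, labels):
    """Return (cutoff, sensitivity, specificity) for each distinct score,
    scanning cut-offs from high to low. A case is predicted positive
    when its score is >= the cut-off."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = np.sum(labels == 1)
    n_neg = np.sum(labels == 0)
    rows = []
    for cutoff in np.unique(scores)[::-1]:  # distinct scores, high to low
        predicted = scores >= cutoff
        true_pos = np.sum(predicted & (labels == 1))
        true_neg = np.sum(~predicted & (labels == 0))
        rows.append((float(cutoff), true_pos / n_pos, true_neg / n_neg))
    return rows

# Toy scores and outcomes, for illustration only.
table = cutoff_table([0.9, 0.8, 0.6, 0.4, 0.2], [1, 1, 0, 1, 0])

# One possible rule (an assumption): maximize sensitivity + specificity.
best_cutoff, best_sens, best_spec = max(table, key=lambda r: r[1] + r[2])
```

In a marketing setting the costs of false positives and false negatives usually differ, so a cost-weighted criterion may be more appropriate than the symmetric rule used in this sketch.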

Project Cycle

Model Comparison

Essentially, all models are wrong, but some are useful.

Project Cycle

Marketing Project Overview - Predictive Analytics

Marketing Project Overview - Segmentation

Marketing Project Overview - Program and Service Analysis

  1. Customer connection program
  2. Marketing Pilot Program (Sampling strategy, experimental design)
  3. Promotion Program
    • Control-group based (matching): control for some or all of the potentially confounding influence of pretreatment variables by reducing the imbalance between the treated and control groups. After preprocessing in this way, any analysis method that would have been used without matching can be applied to estimate causal effects
    • Coarsened Exact Matching (CEM): http://gking.harvard.edu/cem/
    • Propensity Score Matching (PSM)
  4. Causal Inference (Difficult!!)
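
The matching idea described under the promotion program can be illustrated with a simplified coarsened exact matching sketch: coarsen each pretreatment covariate into bins, keep only the strata that contain both treated and control customers, and average the within-stratum differences weighted by the number of treated customers. This is a toy version under stated assumptions, not the CEM software linked above; the function name, arguments, bin edges, and data are all hypothetical.

```python
from collections import defaultdict
import numpy as np

def cem_att(covariates, treated, outcome, bin_edges):
    """Estimate the average treatment effect on the treated after
    exact matching on coarsened covariates.

    covariates: (n, p) array-like; bin_edges: one list of edges per column.
    """
    covariates = np.asarray(covariates, dtype=float)
    # Coarsen each covariate column into integer bin labels.
    coarse = np.column_stack([
        np.digitize(covariates[:, j], bin_edges[j])
        for j in range(covariates.shape[1])
    ])
    # Collect outcomes per stratum, split by control (0) and treated (1).
    strata = defaultdict(lambda: {0: [], 1: []})
    for key, t, value in zip(map(tuple, coarse), treated, outcome):
        strata[key][int(t)].append(float(value))
    weighted_diff, n_treated = 0.0, 0
    for arms in strata.values():
        if arms[0] and arms[1]:  # drop strata missing either group
            diff = np.mean(arms[1]) - np.mean(arms[0])
            weighted_diff += len(arms[1]) * diff
            n_treated += len(arms[1])
    if n_treated == 0:
        raise ValueError("no stratum contains both treated and control units")
    return weighted_diff / n_treated

# Toy data: one covariate (age), coarsened at 40 and 60.
age = [[25], [27], [45], [47], [65]]
got_promotion = [1, 0, 1, 0, 1]
spend = [10.0, 8.0, 20.0, 15.0, 30.0]
att = cem_att(age, got_promotion, spend, bin_edges=[[40, 60]])
```

In this toy example the 65-year-old treated customer has no control in the same age bin and is dropped, which is the price matching pays for comparing like with like.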

Marketing Project Overview - Social Media Analysis

Marketing Project Overview - Others